Evolutionary models accounting for layers of selection in protein-coding genes and their impact on the inference of positive selection.

Authors

  • Nimrod D Rubinstein
  • Adi Doron-Faigenboim
  • Itay Mayrose
  • Tal Pupko
Abstract

The selective forces acting on a protein-coding gene are commonly inferred using evolutionary codon models by contrasting the rate of nonsynonymous substitutions with the rate of synonymous substitutions. These models usually assume that the synonymous substitution rate, Ks, is homogeneous across all sites, which is justified if synonymous sites are free from selection. However, a growing body of evidence indicates that the DNA and RNA levels of protein-coding genes are subject to varying degrees of selective constraint due to various biological functions encoded at these levels. In this paper, we develop evolutionary models that account for these layers of selection by allowing for both among-site variability of substitution rates at the DNA/RNA level (which leads to Ks variability among protein-coding sites) and among-site variability of substitution rates at the protein level (Ka variability). These models are constructed so that positive selection is either allowed or not. This enables statistical testing of positive selection when variability in the DNA/RNA substitution rate is accounted for. Using this methodology, we show that variability of the baseline DNA/RNA substitution rate is a widespread phenomenon in coding sequence data of mammalian genomes, most likely reflecting varying degrees of selection at the DNA and RNA levels. Additionally, we use simulations to examine the impact that accounting for the variability of the baseline DNA/RNA substitution rate has on the inference of positive selection. Our results show that ignoring this variability results in a high rate of erroneous positive-selection inference. Our newly developed model, which accounts for this variability, does not suffer from this problem and hence provides a likelihood framework for the inference of positive selection on a background of variability in the baseline DNA/RNA substitution rate.
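The abstract describes paired models in which positive selection is either allowed or not, compared within a likelihood framework. One common way to compare such nested models is a likelihood ratio test, and the sketch below illustrates that idea only. It is not the authors' implementation: the function names, the log-likelihood values, the choice of degrees of freedom, and the use of a chi-square reference distribution are all assumptions made for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (closed forms for df = 1 or 2)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or 2")

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Compare a null model (no positive selection) with a nested alternative.

    lnl_null, lnl_alt: maximized log-likelihoods of the two models.
    df: assumed difference in the number of free parameters.
    """
    # The statistic cannot be negative for truly nested models; clamp numerical noise.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods, chosen only to show the call.
stat, p_value = likelihood_ratio_test(lnl_null=-10234.7, lnl_alt=-10229.1, df=2)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.4f}")
```

In practice the two log-likelihoods would come from fitting the codon models to an alignment and tree, and the appropriate reference distribution depends on how the models are nested.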

Similar resources

Phylogenetic Analysis of Three Long Non-coding RNA Genes: AK082072, AK043754 and AK082467

It is now clear that protein is just one of the functional products of the eukaryotic genome. Indeed, a larger part of the human genome is transcribed into non-coding sequences than into protein-coding sequences. In this study, we selected three long non-coding RNAs, namely AK082072, AK043754 and AK082467, which show brain expression and local region conservation among vertebr...

The Impact of Contextual Clue Selection on Inference

Linguistic information can be conveyed in the form of speech and written text, but it is the content of the message that is ultimately essential for higher-level processes in language comprehension, such as making inferences and associations between text information and knowledge about the world. Linguistically, inference is the shovel that allows receivers to dig meaning out from the text with...

Selection of suitable reference genes for real-time PCR studies of early developmental stages of sturgeons

In quantitative real-time PCR, the mRNA level can be quantified in relative terms based on the expression ratio of mRNAs of the target gene and an internal reference gene. Since an internal standard should be expressed at a constant level among different tissues of an organism at all stages of development, and should be unaffected by the experimental treatment, the stability of different refer...

The Impact of Investment Inefficiency and Cash Holding on CEO Turnover

The purpose of this study is to investigate the effect of investment inefficiency and cash holding on CEO turnover. The study applies a logistic regression estimator to examine this relationship using 1,309 firm-year observations in Iran for the period 2009-2019. According to positive relati...

A Method for the Simultaneous Estimation of Selection Intensities in Overlapping Genes

Inferring the intensity of positive selection in protein-coding genes is important since it is used to shed light on the process of adaptation. Recently, it has been reported that overlapping genes, which are ubiquitous in all domains of life, seem to exhibit inordinate degrees of positive selection. Here, we present a new method for the simultaneous estimation of selection intensities in overl...

Journal title:
  • Molecular biology and evolution

Volume 28, Issue 12

Pages: -

Publication date: 2011